Portuguese Language Processing Service

Authors

  • Eraldo R. Fernandes
  • Ruy L. Milidiú
  • Cícero N. dos Santos
Abstract

Current Natural Language Processing tools provide shallow semantics for textual data. This kind of knowledge could be used in the Semantic Web. In this paper, we describe F-EXT-WS, a Portuguese Language Processing Service that is now available on the Web. The first version of this service provides Part-of-Speech Tagging, Noun Phrase Chunking and Named Entity Recognition. All of these tools were built with the Entropy Guided Transformation Learning algorithm, a state-of-the-art Machine Learning algorithm for such tasks. We show the service architecture and interface. We also report on some experiments to evaluate the system’s performance. The service is fast and reliable.

Similar resources

The Presence and Influence of English in the Portuguese Financial Media

As the lingua franca of the 21st century, English has become the main language for intercultural communication for those wanting to embrace globalization. In Portugal, it is the second language of most public and private domains influencing its culture and discourses. Language contact situations transform languages by the incorporations they make from other languages and Portugal has...

Real-Time Open-Domain QA on the Portuguese Web

This paper presents a system for real-time, open-domain question answering on the Web of documents written in Portuguese, prepared to handle factual questions and available as a freely accessible online service. In order to deliver candidate answers to input questions phrased in Portuguese, this system resorts to a number of shallow processing tools and question answering techniques that are sp...

Corpora at Linguateca: Vision and roads taken

In the late nineties, access to Portuguese data in electronic form was scarce, and was considered one of the bottlenecks limiting the advance of natural language processing of Portuguese (Santos, 1999a), so Linguateca’s launching of AC/DC had as purpose to significantly increase the amount of data – and its quality, in that the data was annotated and classified. To the best of my knowledge, A...

LX-Service: Web Services of Language Technology for Portuguese

In the present paper we report on the development of a cluster of web services of language technology for Portuguese that we named as LXService. These web services permit the direct interaction of client applications with language processing tools via the Internet. This way of making available language technology was motivated by the need of its integration in an eLearning environment. In parti...

Supporting FrameNet Project with Semantic Web Technologies

FrameNet Project is being developed by ICSI at Berkeley, with the goal of documenting the English language lexicon based on Frame Semantics. For Brazilian Portuguese, the FrameNet-Br Project, hosted at UFJF, follows the same theoretical and methodological perspective. This work presents a service-based infrastructure that combines Semantic Web technologies with FrameNet-like databases, by consi...

Publication date: 2009